
# **Globally Optimised Energy-Efficient Data Centres**

Dirk Pesch, Susan Rea, J. Ignacio Torrens, Vojtech Zavrel, J.L.M. Hensen, Diarmuid Grimes, Barry O'Sullivan, Thomas Scherer, Robert Birke, Lydia Chen, Ton Engbersen, Lara Lopez, Enric Pages, Deepak Mehta, Jacinta Townley and Vassilios Tsachouridis

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/65988

#### **Abstract**

Data centres are part of today's critical information and communication infrastructure, and the majority of business transactions as well as much of our digital life now depend on them. At the same time, data centres are large primary energy consumers, with energy consumed by IT and server room air conditioning equipment and also by general building facilities. In many data centres, IT equipment energy and cooling energy requirements are not always coordinated, so energy consumption is not optimised. Most data centres lack an integrated energy management system that jointly optimises and controls all of their energy-consuming equipment in order to reduce energy consumption and increase the use of local renewable energy sources. In this chapter, the authors discuss the challenges of coordinated energy management in data centres and present a novel scalable, integrated energy management system architecture for data centre wide optimisation. A prototype of the system has been implemented, including joint workload and thermal management algorithms. The control algorithms are evaluated in an accurate simulation-based model of a real data centre. Results show significant energy savings potential, in some cases up to 40%, by integrating workload and thermal management.

**Keywords:** energy efficient data centres, workload management, thermal management, integrated data centre energy management platform

### **1. Introduction**

Data centres have become a critical part of modern information technology (IT) infrastructure, with software as a service, mobile cloud applications, digital media streaming and the expected growth in the Internet of Everything all relying on data centres. However, data centres are also significant primary energy users: they now consume on the order of 3% of worldwide electricity and are responsible for 2% of global greenhouse gas emissions, the same as the airline industry [1]. With the increasing move towards cloud computing and storage as well as everything-as-a-service computing, data centre energy consumption is currently growing at a compound annual rate of over 10% and is expected to reach approximately 8% of global energy consumption by 2020 [2, 3]. The hyper-scale data centres of large cloud service providers consume tens of megawatts of power, with corresponding annual electricity bills on the order of tens of millions of dollars, for example, Google with over 260 MW and \$67 M and Microsoft with over 150 MW and \$36 M in 2010 [4]. These large cloud service providers are also investing heavily in energy efficiency and green data centres; for example, Google and Microsoft have invested over \$900 M in energy reduction measures since 2010. However, smaller operators, independent and co-location/multi-tenant data centres have not yet been able to deploy many of the energy efficiency technologies that are available. This is due to a lack of integrated technology solutions and uncertainty about costs and the use of renewable energy solutions. In particular, the many server rooms and small data centres run by commercial businesses and universities are the dominant electricity users, as shown in **Figure 1** [5].

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. © 2017 The Author(s). Licensee InTech. Distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution and reproduction for non-commercial purposes, provided the original is properly cited.

On average, computing consumes 60% of total energy in data centres while cooling consumes 35% [6]. New server and cooling technologies have the potential to lead to a 40% reduction of energy consumption, but computation and cooling typically operate without joint coordination or optimisation. While server energy management can reduce energy use at CPU, rack and overall data centre level, dynamic computation scheduling is often inefficient, with many idle servers left running rather than being shut down [5], and it is generally not integrated with cooling. Data centre cooling typically operates at constant cold air temperature to protect the hottest server racks, while local fans distribute the air across racks. However, these local server controls are typically not integrated with room cooling systems, which means that it is not possible to optimise chillers, air fans and server fans as a single, whole system.

**Figure 1.** Estimated US data centre electricity consumption by market segment (2011) [5].

In order to reduce the CO<sub>2</sub> footprint of data centres, large organisations such as Google and Facebook are investing in renewable energy sources (RES), such as solar photovoltaics (PV) or wind power, often co-located with their hyper-scale data centres [7, 8]. However, for the many smaller data centres and server rooms, the use or integration of renewable energy sources has received limited interest. The reason for this is that these data centres are typically embedded in buildings that also hold other functions, for example, office and meeting spaces, laboratories and lecture rooms in the case of universities. A further major issue is the lack of interoperability of generation, storage and heat recovery, together with current installation and maintenance costs versus payback [9]. By and large, data centre operators who want to be green and use renewable energy buy electricity that has been given a green label by their respective supplier, often without being able to fully verify this. The intermittency of renewable energy generation is also a critical factor in an environment with very strict service level agreements and essentially 100% uptime requirements. The adoption of new technologies related to computing, cooling, generation, energy storage and waste heat recovery individually requires sophisticated controls, but no single manufacturer provides a complete system, so integration between control systems does not exist.

However, research has been under way in a cluster of projects funded by the European Commission's Framework Programme for Research and Innovation. The cluster includes projects such as DC4Cities, GENiC, CoolEmAll, RenewIT, Eureca, GEYSER, GreenDataNet, Dolfin and All4Green, which are all focused on a range of aspects to increase data centre energy efficiency but also to integrate data centre energy use and recovery into a future smart grid and smart city environment. One of those projects, GENiC (http://www.projectgenic.eu), in particular, aims at developing integrated cooling and computing control strategies in conjunction with innovative power management concepts that incorporate renewable electrical power supply and storage, and waste heat management. The project's aim is to address the issue mentioned above by developing an integrated, flexible, component-based management and control platform for data centre wide optimisation of energy consumption, reduction of carbon emissions and increased local renewable energy supply usage through integrating monitoring and control of computation, data storage, cooling, on-site power generation and waste heat recovery.

A key element in not only achieving a reduction in energy consumption but also a reduction in carbon emissions is energy supply by renewable energy generation and, where possible, energy storage equipment. Such an approach needs to be operated as a complete system to achieve an optimal energy and emissions outcome. This vision of integrated, holistic energy management is centred on the development of a hierarchical control system to operate all of the primary data centre components in an optimal and coordinated manner.

## **2. Challenges for integrated data centre energy management**

While data centres have become a critical IT infrastructure and also a significant consumer of energy and contributor to CO<sub>2</sub> emissions, opportunities exist to enhance the energy and power management of data centres in conjunction with renewable energy generation and integration with their surrounding infrastructure. Work has been done on studying the topic of powering of data centres by renewable energy [10], but this has not been fully integrated into a complete energy management system considering coordinated workload management, cooling, powering and heat recovery management. While much work has focused on integrated energy management for data centres [11, 12], there is still a lack of an overall consideration of energy usage and powering with the recovery of waste heat as part of an overall thermal management approach. In order to bring the elements of workload management, cooling, powering and heat recovery together in such a way that it will be possible to achieve a high level of renewable energy powering of data centres, a comprehensive integrated energy management system is needed. The challenges that such a system needs to address are as follows:


In order to achieve this while making sure energy consumption costs do not exceed certain levels, effective monitoring and fault management tools are important and can assist operators with their work.

## **3. An architecture for globally optimised energy management in data centre**

To address the challenges outlined above, the EC-funded GENiC project has developed a high-level architecture for an integrated design, management and control platform, targeting data centre wide optimisation of energy consumption by encapsulating monitoring and control of IT workload, data centre cooling, local power generation, energy storage and waste heat recovery. The developed management platform includes control and optimisation, decision support, and fault detection functions and defines interfaces and common data formats to enable a component-based design. The GENiC architecture can act as a template for a wide range of implementations of data centre energy management systems suited to a particular data centre configuration. In the following, a functional specification of the GENiC architecture is presented and an overview of the integration framework is provided. The applicability of the proposed functional architecture is illustrated by a number of use cases. More detail can be found in [13].

#### **3.1. Functional architecture**

The GENiC architecture integrates workload management, thermal management and power management by using a hierarchical control concept that enables the coordination of the management sub-systems in an optimal manner with respect to the cost of energy consumption, environmental impact and cost policies. **Figure 2** provides an overview of the developed GENiC system architecture, which consists of six functional groups, the GENiC component groups (GCGs):


**Figure 2.** Overview of the GENiC architecture (from [13]).

• The **Integration Framework GCG** provides the communication infrastructure and data formats that are used for interactions between all components of the GENiC system.

Each GCG is composed of a number of functional components, the GENiC components (GCs) (see **Figure 2**). The core function of the GENiC system for continuous data centre energy optimisation can be divided into four basic steps:


These elements are complemented by components for external data acquisition and fault detection and diagnostics. The basic information flow for coordinating workload, thermal and power management is illustrated in **Figure 3**. In the following, the GENiC component groups are described in more detail.

**Figure 3.** Information flow (simplified) for coordinating workload, thermal and power management [14].

**Workload Management GCG:** The primary objective of this GCG is to allocate virtual machines (VMs) to physical machines (PMs) such that service level objectives (SLOs) are satisfied with low operational cost. Monitoring data from the IT resources deployed within the data centre are collected by the Workload Monitoring GC. The Workload Prediction GC uses this information to provide short- and long-term predictions on resource utilization. The allocation and migration of VMs to PMs are determined by the Workload Allocation Optimisation GC, which solves a constrained optimisation problem, taking the predicted workload as well as constraints provided by the Supervisory Intelligence GC, Thermal Prediction GC and Performance Optimisation GC into consideration. The Performance Optimisation GC defines location constraints for individual VMs and modifies the individual VMs' priorities to fulfil application-specific SLOs. The VM allocation plan is finally applied by the Workload Actuation GC, which provides an interface to the data centre-specific virtualization platform.
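To make the VM-to-PM allocation problem concrete, the following is a minimal sketch that uses a first-fit-decreasing bin-packing heuristic. This is a deliberately simple stand-in and not the constrained optimisation solved by the Workload Allocation Optimisation GC; the `PM` class, the single CPU dimension and all names are assumptions made for illustration, and thermal, SLO and colocation constraints are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class PM:
    """Hypothetical physical machine with a single CPU capacity dimension."""
    name: str
    cpu_capacity: int
    vms: dict = field(default_factory=dict)  # VM name -> predicted CPU demand

    @property
    def used(self):
        return sum(self.vms.values())


def first_fit_decreasing(vm_demand, pms):
    """Place VMs on PMs, largest demand first, filling earlier PMs first.

    Consolidating load onto few PMs leaves the remaining PMs empty, so they
    become candidates for being switched off. Returns the VMs that did not fit.
    """
    unplaced = []
    for vm, demand in sorted(vm_demand.items(), key=lambda kv: kv[1], reverse=True):
        for pm in pms:
            if pm.used + demand <= pm.cpu_capacity:
                pm.vms[vm] = demand
                break
        else:
            # No PM had enough spare capacity for this VM.
            unplaced.append(vm)
    return unplaced


if __name__ == "__main__":
    hosts = [PM("pm0", 10), PM("pm1", 10), PM("pm2", 10)]
    left_over = first_fit_decreasing({"a": 6, "b": 5, "c": 4, "d": 3}, hosts)
    for host in hosts:
        print(host.name, host.vms)
    print("unplaced:", left_over)
```

In this toy run the third host stays empty, which is the kind of outcome an energy-aware allocator would exploit by powering the host down.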

**Thermal Management GCG:** The Thermal & Environment Monitoring GC integrates monitoring of cooling systems and a sensor network infrastructure for collecting temperature and other environmental data in the data centre space. The collected data are used by the Thermal Prediction GC to provide short-term and long-term predictions to support supervisory control decisions, thermal actuation and workload allocation. Long-term predictions are used for making decisions at the supervisory level. Short-term thermal predictions are required by the Thermal Actuation GC along with real-time sensor measurements to determine optimal set points for the cooling system in order to achieve the targets set by the Supervisory Intelligence GC. These short-term thermal predictions are also necessary inputs to the Supervisory Intelligence GC and to the Workload Allocation Optimisation GC, as they include temperature models for the thermal contribution of IT server workload to the server inlets. Furthermore, short-term predictions, combined with equipment fault information from the Thermal Fault Detection & Diagnostics (FDD) GC, are used for fault detection and diagnostics at the supervisory level.

**Power & RES Management GCG:** The Power Monitoring GC provides power monitoring information of the DC (power consumed per server, per rack and total DC power demand), and integrates monitoring of the RES infrastructure for local energy generation and storage with data centre power consumption requirements. These data are used by the Power Prediction GC to provide IT power prediction as well as long-term predictions to support supervisory control decisions and power actuation. The Power Actuation GC determines operation set points for the power systems based on operation policies provided by the Supervisory Intelligence GC and adjusts them depending on measured data and operational conditions.

**Supervision GCG:** The Supervisory Intelligence GC is responsible for the overall coordination of workload, thermal, power management and heat recovery. It considers power demand and supply, grid energy price and energy storage availability, and determines how much power should be supplied from the electricity grid, RES and energy storage to achieve a particular objective on power usage. To this end, it provides policies for the components in the Workload Management, Thermal Management and Power & RES Management GCGs based on information from monitoring and prediction components. The Supervisory Intelligence GC provides these high-level policies for the purpose of guiding the individual management functions towards the Supervisory Intelligence objective strategy that has been chosen as the driver for current data centre operations. Key objective choices might be minimization of financial cost, minimization of carbon emissions or maximization of RES usage. To detect and diagnose system anomalies, the Supervisory FDD GC compares predicted values with measurement data and collects and evaluates fault information. In appropriate situations, the Supervisory FDD GC informs the Supervisory Intelligence GC when a deviation becomes substantial enough to negatively impact system operation so that mitigation action can be taken by the platform until the fault has been corrected. The Human-Machine Interface GC provides a framework for user interfaces that allow data centre operators to monitor and evaluate aggregated data provided by the individual GCs.
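The comparison of predicted and measured values described for the Supervisory FDD GC can be illustrated with a simple residual check. The sketch below is only one plausible way to decide when a deviation is "substantial"; the threshold, the requirement of several consecutive violations and the function name are assumptions, not details taken from the GENiC implementation.

```python
def detect_deviation(predicted, measured, threshold, min_consecutive=3):
    """Return the index at which a sustained deviation is confirmed, else None.

    A deviation is flagged once |measured - predicted| has exceeded
    `threshold` for `min_consecutive` samples in a row. Requiring a run of
    violations is an assumed way to avoid reacting to single noisy samples.
    """
    run = 0
    for i, (p, m) in enumerate(zip(predicted, measured)):
        if abs(m - p) > threshold:
            run += 1
            if run >= min_consecutive:
                return i
        else:
            run = 0  # deviation was not sustained, start counting again
    return None


if __name__ == "__main__":
    # Hypothetical inlet temperatures in degrees Celsius.
    predicted = [22, 22, 22, 22, 22, 22]
    measured = [22, 25, 22, 25, 26, 27]
    print(detect_deviation(predicted, measured, threshold=2))
```

A caller playing the role of the supervisory layer could treat a non-`None` result as the trigger for a mitigation policy.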

**Integration Framework GCG:** The Communication Middleware GC provides the communication infrastructure used within the GENiC platform. The Data Centre Configuration GC uses a centralized data repository to store all information related to the data centre configuration, including information on data centre layout, cooling equipment, monitoring infrastructure, IT equipment and virtual machines running in the data centre. Finally, the External Data Acquisition GC provides access to data not collected by existing components of the GENiC platform, including weather data, grid energy prices and grid energy CO<sub>2</sub> indicators.

The GENiC platform integrates distributed software components, which are developed and maintained by individual consortium partners. A software component can implement a single GC, multiple GCs or just part of a GC to provide the required functionality to the platform. A topic-based publish-subscribe messaging architecture is a suitable mechanism to ensure a robust data exchange between individual software components. With this approach, the components do not need to be connected directly to each other; instead, components publish messages to a central message broker using pre-defined topics and subscribe at the broker to those topics from other components that are of interest to them. The broker forwards all incoming messages to the appropriate subscribers. The GENiC architecture defines a consistent interface specification using a common data format for all GENiC components. All interfaces are defined by hierarchically structured topics. Each of these topics has a defined message payload structure that uses the GENiC common data exchange format, which is specified based on JSON [15]. This approach creates a very flexible data centre management platform that can be configured to suit individual, local data centre configurations.

**Support Tools GCG:** The GENiC platform includes a number of tools to assist data centre planners, system integrators and data centre operators:


#### **3.2. Energy management use case**

The GENiC project's goal of operating data centres optimally with respect to energy is pursued through the integration of workload management, thermal management and power management (including powering through renewable energy sources) via a hierarchical supervisory control concept. Key optimisation criteria considered by data centre operators are (i) meeting agreed service level agreements (SLAs), (ii) minimisation of total energy costs, and (iii) where renewable energy sources are available, maximisation of RES power use and minimisation of carbon emissions. To account for fluctuations in the IT workload demand and the availability of renewable energy supply (which includes local on-site energy production and grid power), the set points of the management sub-systems have to be adapted over time. The Supervisory Intelligence (SI) GC coordinates the individual management sub-systems, including renewable energy supply, by providing optimal policies with respect to the selected optimisation criterion. The use case scenario is illustrated in **Figure 4**. The basic operational flow is as follows [14]:

**Step 1:** The monitoring GCs (Workload Monitoring, Thermal & Environment Monitoring, and Power Monitoring) collect data from VMs, PMs, air conditioning equipment, sensor networks, power meters and on-site energy supply systems. The relevant information is forwarded to the individual prediction and actuation GCs and to SI.

**Step 2:** Based on recent and historical monitoring data, the prediction GCs (Workload Prediction, Thermal Prediction, and Power Prediction) predict server power demand, thermal profile and cooling demand, RES production capacity and energy demand. The relevant information is forwarded to the individual actuation GCs and to SI.

**Step 3:** Additional data, that is, weather data and grid energy prices, are obtained from external data sources and forwarded to SI by the External Data Acquisition GC.

**Step 4:** SI provides a set of policies to the actuation GCs (Workload Allocation Optimisation, Thermal Actuation and Power Actuation). The policies are based on inputs from the monitoring and prediction components and on further interactions with the Power Prediction GC. These interactions validate the consequences of particular power profiles that SI considers as part of the policy definition. The Workload Allocation Optimisation GC solves a constrained optimisation problem to determine an optimal VM allocation plan minimizing server energy consumption, taking into consideration the upper-bound IT power budget recommended by SI and additional inputs from other GCs (thermal, colocation and anti-colocation constraints). The Thermal Actuation GC takes the minimum and maximum allowable data centre temperatures provided by SI and calculates cooling equipment set points that keep the room's thermal profile properly regulated with minimal cooling equipment electrical power consumption. The Power Actuation GC implements the distribution plan for drawing electricity from the grid and from controllable and uncontrollable RES, and the schedule for charging and discharging the energy storage device.

**Figure 4.** Energy management use case [14].

**Step 5:** Based on the inputs from SI and the Workload Allocation Optimisation GC, as well as monitoring and prediction components, the actuation GCs (Workload Actuation, Thermal Actuation, and Power Actuation) decide and apply the actual control actions. For example, the Workload Actuation GC executes the VM allocation plan and switches PMs on or off, based on the actuation requests. Faults are reported back to the optimisation GCs to be considered in the next iteration of the optimisation process.
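The distribution plan mentioned in Step 4 (how much electricity to draw from the grid, from RES and from storage) can be illustrated for a single time step with a simple priority rule. The sketch below is an assumed greedy rule that favours RES use; it is not the optimisation performed by the Supervisory Intelligence GC, and it ignores grid prices, conversion losses and charging rate limits. All parameter names are hypothetical.

```python
def dispatch(demand_kw, res_kw, storage_kwh, storage_capacity_kwh,
             max_discharge_kw, step_h=1.0):
    """Split one time step's demand across RES, storage and grid.

    Priority (an assumption): use RES first, then discharge storage,
    then draw the remainder from the grid. Surplus RES charges the storage.
    """
    from_res = min(demand_kw, res_kw)
    remaining = demand_kw - from_res

    # Discharge is limited by the rated power and by the stored energy.
    from_storage = min(remaining, max_discharge_kw, storage_kwh / step_h)
    from_grid = remaining - from_storage

    # Any RES surplus charges the storage up to its free capacity.
    surplus = res_kw - from_res
    charge_kw = min(surplus, (storage_capacity_kwh - storage_kwh) / step_h)

    new_storage = storage_kwh - from_storage * step_h + charge_kw * step_h
    return {
        "from_res_kw": from_res,
        "from_storage_kw": from_storage,
        "from_grid_kw": from_grid,
        "charge_kw": charge_kw,
        "storage_kwh": new_storage,
    }


if __name__ == "__main__":
    print(dispatch(demand_kw=20, res_kw=8, storage_kwh=5,
                   storage_capacity_kwh=10, max_discharge_kw=4))
```

Because storage is only discharged when RES cannot cover demand and only charged when RES exceeds it, the rule never charges and discharges in the same step.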

## **4. Prototype implementation**

**Figure 5** illustrates a prototype implementation of the GENiC architecture. The GENiC distributed architecture approach with clearly defined interfaces simplifies integration of a diverse set of software components and allows flexible configuration of the platform. Due to the diverse set of technologies in use in data centres, for example, IT systems, cooling systems, power systems and RES facilities, there is typically no individual manufacturer who supplies all the systems that a data centre requires. Therefore, a data centre management system architecture needs to allow for the integration of individual components supplied by multiple manufacturers and service providers. The architecture detailed in Section 3 is scalable and flexible at the same time and is based on micro-service architecture principles that offer the following benefits:


• **Simplified testing and integration—**testing and integration are easier as testing focuses on black box testing with implementation details hidden behind APIs. Service integration hides APIs and dependencies.

A central element of the prototype implementation is the use of the RabbitMQ messaging system [16] as the message broker. RabbitMQ offers client implementations in a wide range of programming languages, which lets manufacturers choose one that suits their individual technology set-up. A Generic Client architecture has been developed to allow each component provider to expose their components in a distributed manner in the architecture. The individual GENiC components are implemented as services that communicate via the message broker. The client architecture also offers an easy way to integrate third-party (closed source) services with minimal effort. The components implemented in the GENiC prototype are shown in **Figure 5**, colour coded by the component group they belong to. Short-term monitored data are stored in a database backend in the GENiC prototype implementation. CouchDB, a NoSQL database, is used, but many other database solutions are possible depending on the specific needs and data volumes of a particular configuration. Due to the large quantity of stored data, only short-term data are available on the broker.

**Figure 5.** GENiC architecture implementation prototype.

## **5. Assessment of energy efficiency**

In order to assess the effectiveness of data centre management systems in terms of energy efficiency, power management, managing increased penetration of renewable energy sources, heat reuse and data centre flexibility, selecting appropriate metrics is of paramount importance. The aforementioned cluster of European research projects on data centre energy efficiency has taken five common data centre metrics and defined 21 new metrics, along with measurement methodologies, to adequately capture the energy efficiency, flexibility and sustainability of modern data centres [17]. This approach supports the development of a common framework for monitoring and assessing the flexibility and sustainability of data centres. The metrics of specific interest for the evaluation of an integrated energy management platform, which integrates thermal and workload management with renewable energy/power supply and heat recovery, are listed in **Table 1**.

The GENiC project considers two types of evaluation. The first is a simulation-based assessment (SBA), which uses the Simulators GENiC component (see **Figure 2**), one of the tools developed in the project. The Simulators component provides a virtual data centre based on a TRNSYS model implementation and simulation, together with additional interfacing and timing functions [18]. The SBA uses the full energy management platform in the same manner as it is used in a real physical data centre. SBA has the advantage that a specific architecture configuration can be tuned to a particular data centre set-up before deployment in the real environment. This allows for a priori energy efficiency assessment, which not only enables data centre operators to understand what energy savings can be expected from a deployment of an integrated data centre energy and power management platform, but also prepares the platform to run optimally once deployed without affecting the real environment during an in situ tuning process.

**Table 1.** GENiC evaluation metrics.

The second evaluation is based on the deployment of the prototype in a real data centre. The project chose a small but typical data centre at Cork Institute of Technology. The data centre was adapted to the needs of the project to enable extensive control of the thermal management side, including heat recovery, and to support both virtualisation of the computing infrastructure and normal operation. Experimental renewable energy facilities are linked to the data centre in a virtual manner, as the renewable energy micro-grids are located on two premises of project partner Acciona in Spain. The use of renewable energy is demonstrated by recording the amount of energy that can be generated by typical micro-grids over time and accounting for the electricity flowing into the data centre as either non-renewable or renewable.
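The virtual linking of remote micro-grids to the data centre amounts to a bookkeeping exercise over matching time intervals. The following sketch shows one possible accounting rule, assumed here for illustration: in each interval, recorded generation counts as renewable supply up to the data centre demand in that interval, and anything beyond that is not credited. The function name and the sample values are hypothetical.

```python
def renewable_accounting(demand_kwh, generation_kwh):
    """Classify consumed electricity as renewable or non-renewable.

    Both inputs are per-interval energy series of equal length. Generation
    in an interval is credited only up to the demand in that same interval
    (an assumed rule; no storage or carry-over between intervals).
    Returns (renewable_kwh, non_renewable_kwh, renewable_share).
    """
    renewable = sum(min(d, g) for d, g in zip(demand_kwh, generation_kwh))
    total = sum(demand_kwh)
    share = renewable / total if total else 0.0
    return renewable, total - renewable, share


if __name__ == "__main__":
    demand = [10, 10, 10, 10]      # hypothetical data centre demand per interval
    generation = [0, 5, 15, 10]    # hypothetical recorded micro-grid output
    print(renewable_accounting(demand, generation))
```

Crediting generation interval by interval, rather than as a total over the whole period, keeps the intermittency of the renewable supply visible in the result.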

#### **5.1. Simulation model—virtual C130 data centre**

In order to evaluate the performance of the GENiC platform and to allow pre-deployment assessment and tuning, the project has developed a Simulators GC, which is part of the Support Tools GCG. The simulator component includes energy models that emulate the performance of a data centre and its systems, supporting the development and testing of GENiC components as well as the commissioning of the overall GENiC platform, prior to its physical deployment to the real data centre [19]. The Simulators GC consists of the energy models shown in **Figure 6**. These are on the demand side, for example, data centre environment (building energy model and building airflow model), IT devices model, and heating, ventilation and air conditioning (HVAC) systems model, and on the supply side, for example, power supply model.

**Figure 6.** Types of energy models in the Simulator GC.

**Figure 7.** Floor plan of the data centre room used for the simulation‐based assessment.

In order to demonstrate the functionality and feasibility of this approach, the Simulator GC implements a virtual data centre model that is based on the actual GENiC demonstration site, the C130 data centre at Cork Institute of Technology. The data centre space is cooled by one main computer room air conditioning unit (CRAC) and one backup air conditioning unit (AC) as illustrated in the floor plan depicted in **Figure 7**.

#### **5.2. IT equipment and DC whitespace characteristics**

To emulate the server workload in the data centre, a set of virtual machine (VM) configurations and the VMs' resource utilization traces are required. The traces used for the evaluation example presented here have been collected from a typical corporate data centre production environment and reflect typical enterprise workloads seen in a private cloud environment. The traces comprise resource utilization data for 2400 different VMs hosted on 132 servers. The key parameters of these servers are summarized in **Table 2**. The last column shows the number of servers of each specific type. Each server's dynamic power consumption is modelled as

$$P_{server} = \left(P_{\max} - P_{idle}\right) \times u + P_{idle}$$

where *u* is the CPU utilization, *P*<sub>max</sub> is the server's power consumption at full load (i.e. *u* = 1.0), and *P*<sub>idle</sub> is the server's power consumption at idle state (i.e. *u* = 0.0). The total power consumption of the 132 servers is 24.5 kW if all servers operate at full load.
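The linear power model above translates directly into code. The sketch below evaluates it for a single server and sums it over a set of servers; the idle and full-load power values in the example are hypothetical placeholders, since the actual parameters of Table 2 are not reproduced here.

```python
def server_power(u, p_idle, p_max):
    """Linear server power model: (p_max - p_idle) * u + p_idle.

    u is the CPU utilization in [0, 1]; p_idle and p_max are the power
    draw at idle and at full load, in watts.
    """
    if not 0.0 <= u <= 1.0:
        raise ValueError("CPU utilization must lie between 0 and 1")
    return (p_max - p_idle) * u + p_idle


def total_power(servers):
    """Sum the modelled power of (u, p_idle, p_max) tuples."""
    return sum(server_power(u, p_idle, p_max) for u, p_idle, p_max in servers)


if __name__ == "__main__":
    # Hypothetical server: 100 W idle, 200 W at full load.
    for u in (0.0, 0.5, 1.0):
        print(u, server_power(u, p_idle=100, p_max=200))
    print(total_power([(0.5, 100, 200), (1.0, 80, 180)]))
```

The model makes the cost of idle servers explicit: at *u* = 0 a server still draws its full idle power, which is why consolidating VMs and switching off empty servers saves energy.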

For the simulation‐based evaluation example, each server has been mapped to a specific rack space in the simulated data centre. **Table 3** shows this mapping.


**Table 2.** Server parameters.


**Table 3.** Mapping of servers to racks in the virtual data centre.

#### **5.3. Cooling system characteristics**

The environment of the data centre is maintained at temperatures between 18 and 27°C with a relative humidity of 30–60%, as recommended by ASHRAE [20]. The CRAC unit ensures the required indoor climate. Supply air is distributed through a raised floor and reaches the front of the IT devices through perforated tiles. Return air is drawn in by the CRAC unit below the ceiling, as shown in **Figure 8**.
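A monitoring component can check sensor readings against the environmental envelope stated above. The sketch below uses the 18 to 27°C and 30 to 60% relative humidity limits from the text as defaults; the sensor names, the data layout and the function itself are assumptions made for illustration.

```python
def envelope_violations(readings, t_range=(18.0, 27.0), rh_range=(30.0, 60.0)):
    """List sensors whose readings fall outside the allowed envelope.

    `readings` maps a sensor name to a (temperature in deg C, relative
    humidity in %) tuple. Returns (sensor, quantity) pairs in input order.
    """
    violations = []
    for sensor, (temp_c, rh) in readings.items():
        if not t_range[0] <= temp_c <= t_range[1]:
            violations.append((sensor, "temperature"))
        if not rh_range[0] <= rh <= rh_range[1]:
            violations.append((sensor, "humidity"))
    return violations


if __name__ == "__main__":
    sample = {
        "rack1_inlet": (22.0, 45.0),
        "rack2_inlet": (28.5, 45.0),  # too warm
        "rack3_inlet": (20.0, 25.0),  # too dry
    }
    print(envelope_violations(sample))
```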

The conditions of the circulating air are controlled in the CRAC unit by a direct expansion system. The condenser coil of the direct expansion system is cooled by glycol, and heat is rejected to the external ambient environment in a roof‐mounted dry cooler. The process and devices involved are depicted in **Figure 9**.

There is also an auxiliary floor standing air conditioning (AC) unit placed in the room, as shown in **Figure 10**.

**Figure 8.** Schematic of hot and cold aisle arrangements without containments.

**Figure 9.** Main cooling system.

**Figure 10.** Auxiliary air conditioning unit.

## **6. Simulation‐based assessment of energy management**

The simulation‐based evaluation of the GENiC energy management (EM) platform tests the interaction of short‐term (S‐T) actuation and long‐term (L‐T) decision‐making on the virtual C130 data centre test‐bed that replicates the physical processes occurring in the real data centre facility. This interaction and the components involved are shown in **Figure 11**.

A key component in all evaluations reported in this chapter (and shown in **Figure 11** via the arrows between components) is the Communication Middleware GC, which provides the glue between all the different GENiC components and enables message exchange between components via the RabbitMQ broker (see above). The components relevant to each particular evaluation are discussed in the following.

**Figure 11.** Interaction between EM platform GENiC components and virtual DC test‐bed.

#### **6.1. Boundary conditions for the simulation‐based assessment**

All use cases are tested based on identical boundary conditions so that the different operating strategies can be compared to each other. The following external factors are considered as boundary conditions:


#### **6.2. Workload management GCG**

The Workload Allocation Optimisation GC algorithms used within the GENiC prototype implementation were evaluated under the following scenarios (experiments):


The experiment with VM migration limits refers to the evaluation of Workload Allocation Optimisation GC with different values for the maximum number of VM migrations allowed per time period. The evaluation with thermal preferences refers to the testing of Workload Allocation Optimisation GC considering a static thermal server preference when performing server consolidation. This experiment represents a thermal‐aware workload allocation strategy [21]. The workload allocation experiment assesses the performance of the Workload Allocation Optimisation GC when it considers thermal actuation preferences. For the simulation‐based evaluation, a static thermal preference matrix for each of the servers was developed based on Supply Heat Index (SHI) analysis [22] of the C130 data centre white space from the baseline inputs.
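To make the SHI‐based ranking more concrete, the sketch below computes a per‐server SHI in its commonly used temperature form, SHI = (T_inlet − T_supply) / (T_outlet − T_supply), and orders servers from most to least preferred. This is only an illustration under stated assumptions: the chapter does not specify how the preference matrix was derived from the SHI analysis, so the per‐server formulation, the function names and the outlet temperatures are hypothetical. Only the 18°C supply and 23°C hot‐spot inlet temperatures echo values reported in the results.

```python
def supply_heat_index(t_inlet, t_outlet, t_supply):
    """SHI in temperature form: share of the total air temperature rise
    that already occurs before the air enters the server (0 is ideal)."""
    total_rise = t_outlet - t_supply
    if total_rise <= 0:
        raise ValueError("outlet temperature must exceed supply temperature")
    return (t_inlet - t_supply) / total_rise


def rank_by_shi(servers, t_supply):
    """Return server names ordered from lowest SHI (preferred) to highest."""
    return sorted(
        servers,
        key=lambda name: supply_heat_index(servers[name][0], servers[name][1], t_supply),
    )


# Hypothetical (inlet, outlet) temperatures in degrees Celsius per rack position.
servers = {
    "top": (23.0, 33.0),
    "middle": (20.0, 30.0),
    "bottom": (18.0, 28.0),
}
print(rank_by_shi(servers, t_supply=18.0))
```

With these illustrative inputs, the bottom position has an SHI of 0 and is ranked first, while the top position, whose inlet air is already 5 K warmer than the supply air, is ranked last.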

These scenarios were compared against each other and against a baseline allocation strategy. This comparison is assessed based on (i) the thermal behaviour in the white space (e.g. temperature distribution, hot spots) and (ii) energy consumption.

#### *6.2.1. GENiC components involved and testing process*

The GENiC components involved in this particular workload management evaluation example are a subset of those that form the overall Workload Management GCG. This particular subset was chosen here to demonstrate the feasibility of the approach and demonstrate the overall system in operation. The experiments for this evaluation follow these steps:


The Simulators GC captures all the data relevant to this process for analysis and post‐processing. The focus of this evaluation is to analyse the influence of workload allocation strategies on the temperature distribution of the white space as well as on the total DC energy consumption.

#### **6.3. Thermal management GCG**

Further experiments target the evaluation of the Thermal Management GCG algorithms with optimal thermal actuation. In this scenario, the GENiC prototype implementation is evaluated against a baseline operation strategy. This comparison is assessed based on data centre energy consumption and white space temperature distribution.

#### *6.3.1. GENiC components involved and testing process*

The GCs involved in this thermal management evaluation are a subset of those that form the Thermal Management GCG. The subset chosen aligns with the requirements of the particular data centre demonstration site, and other, larger data centre configurations may use a broader spectrum of functionality. The experiments for the thermal management evaluation follow these steps:


The Simulators GC captures all the data relevant to this process for analysis and post‐processing. The focus of this evaluation is to analyse the influence of S‐T prediction and thermal actuation strategies developed in the project on the temperature distribution of the white space as well as on the total DC energy consumption.

#### **6.4. Power management GCG**

In order to evaluate the power management aspects of the GENiC prototype platform, experiments were executed to evaluate the Power Management GCG algorithms under the following scenarios: (i) Power Actuation Logic, and (ii) Power Actuation Logic + SI static constraints. These scenarios are compared against each other and against the baseline operation. This comparison is assessed based on energy demand versus supply (broken down per source).

#### *6.4.1. GENiC components involved and testing process*

The GCs involved in this power management evaluation are a subset of those that form the Power Management GCG and are selected to reflect the specific situation prevalent in the demonstration site. Elements of the power systems micro‐grid available to the project, including a battery bank and an Organic Rankine Cycle (ORC), were modelled and included in this evaluation. The experiments for the Power Management evaluation follow these steps:


The Simulators GC captures all the data relevant to this process for analysis and post‐processing. The focus of these experiments is to analyse the power actuation operation strategies used to satisfy the total DC demand. The power actuation real‐time adjustments are defined so as to assure the renewable energy supply contribution. This is achieved by balancing the lack or excess of weather‐dependent generation with a controllable unit characterised by "unlimited" energy (kWh) capacity, which in this case is the ORC. The ORC has an unlimited energy capacity if the biomass storage is continuously refilled. Electrical batteries, by contrast, are characterised by limited energy capacity (here around 10 kWh) and by operating limits defined by upper and lower bounds on the FSoC (fractional state of charge, between 0 and 1). The ORC generation is adjusted according to the difference between the predicted and the actual weather‐dependent renewable energy output, taking into account the minimum and maximum generation capacity of the ORC (here 4 kW minimum and 7 kW maximum).
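The ORC adjustment described above can be sketched as a correction term that is clamped to the generation limits of the unit. This is a simplified illustration, not the GENiC Power Actuation Logic: the function name, the notion of a planned ORC set‐point and the returned residual are assumptions, and only the 4 kW and 7 kW limits come from the text.

```python
ORC_MIN_KW = 4.0  # minimum ORC generation stated in the text
ORC_MAX_KW = 7.0  # maximum ORC generation stated in the text


def adjust_orc(orc_planned_kw, renewable_predicted_kw, renewable_actual_kw):
    """Shift the ORC set-point by the renewable forecast error, within limits.

    Returns (setpoint_kw, residual_kw). A positive residual is power the ORC
    could not supply; a negative residual is surplus it could not shed.
    """
    shortfall_kw = renewable_predicted_kw - renewable_actual_kw
    target_kw = orc_planned_kw + shortfall_kw
    setpoint_kw = min(max(target_kw, ORC_MIN_KW), ORC_MAX_KW)
    return setpoint_kw, target_kw - setpoint_kw


# Hypothetical operating points (all values in kW).
print(adjust_orc(5.0, 3.0, 1.5))  # forecast error covered by the ORC
print(adjust_orc(5.0, 3.0, 0.0))  # ORC saturates at its maximum
print(adjust_orc(5.0, 1.0, 3.0))  # ORC saturates at its minimum
```

In this sketch, any non‐zero residual would have to be handled by another element of the micro‐grid, for example the battery bank within its FSoC limits. How the platform actually distributes such a residual is not described here.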

## **7. Evaluation results**

The simulation‐based evaluation considers first results from the workload management experiments. The experimental set‐up involved allocating workload over a 48‐h period in a data centre using real VM resource utilization traces. Each VM was initially assigned to a particular server as per the real traces without the Workload Allocation Optimisation GC controlling the initial assignment. The only influence on power consumption was through VM migrations and server consolidation.

**Figure 12.** Power consumption with different migration limits over 48‐h horizon.

#### **7.1. Workload allocation—VM migration limits**

The first experiments evaluated the impact of the migration limit on the workload allocation (without thermal priorities for servers). The baseline is a migration limit of 0, that is, each VM was run on the server it was initially assigned to. From there, a series of experiments was executed to evaluate various migration limits (from 1 to 100) as shown in **Figure 12**.

As expected, increasing the migration limit resulted in a considerable reduction of power consumption (see **Figure 12**). The largest migration limit tested (100 migrations per 10‐min time period) required just a few time periods to achieve a reduction from approximately 11 kW to just over 4 kW. Indeed, the average hourly energy consumption of the IT equipment was 6.71 kWh less with a migration limit of 100 than with the baseline. The figure for IT power consumption (see **Figure 12**) further illustrates that all positive migration limits tended to this equilibrium state, with a migration limit of 10 reaching the 4 kW mark in less than 9 h and the limit of 5 requiring approximately 24 h. Once reached, the variations in power consumption between the migration limits were minor. This means that if the workload allocator had controlled the initial assignment of VMs to servers, then a migration limit of 10 or even 5 would have been sufficient to achieve similar savings as with a limit of 100.
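The effect of a per‐period migration limit can be reproduced qualitatively with a toy consolidation loop. The sketch below is a simple greedy heuristic written for illustration only; it is not the Workload Allocation Optimisation GC algorithm, and the server names, VM loads, power parameters and the rule of emptying lightly loaded servers into more heavily loaded ones are all assumptions.

```python
def consolidate_step(hosts, limit, capacity=100):
    """Perform at most `limit` VM migrations in one time period.

    `hosts` maps a server name to the list of its VM loads (percent of one
    server's CPU capacity). It is modified in place; the number of
    migrations performed is returned.
    """
    moved = 0
    # Try to empty lightly loaded servers first so they can be powered off.
    for src in sorted((s for s in hosts if hosts[s]), key=lambda s: sum(hosts[s])):
        src_load = sum(hosts[src])
        # Candidate targets: other active servers that are at least as loaded.
        free = {t: capacity - sum(vms) for t, vms in hosts.items()
                if t != src and vms and sum(vms) >= src_load}
        plan = []
        for vm in sorted(hosts[src], reverse=True):
            fits = [t for t in free if free[t] >= vm]
            if not fits:
                plan = None  # src cannot be emptied completely; leave it alone
                break
            target = min(fits, key=lambda t: free[t])  # best fit
            plan.append((vm, target))
            free[target] -= vm
        if plan is None:
            continue
        for vm, target in plan:
            if moved >= limit:
                return moved  # migration budget for this period is used up
            hosts[src].remove(vm)
            hosts[target].append(vm)
            moved += 1
    return moved


def it_power(hosts, p_idle=100.0, p_max=200.0, capacity=100):
    """Total IT power in watts; servers without VMs are assumed powered off."""
    return sum(p_idle + (p_max - p_idle) * sum(vms) / capacity
               for vms in hosts.values() if vms)


def run(limit, periods=4):
    """Return the IT power trace for a hypothetical four-server example."""
    hosts = {"s1": [20], "s2": [10, 10], "s3": [30], "s4": [20, 10]}
    trace = [it_power(hosts)]
    for _ in range(periods):
        consolidate_step(hosts, limit)
        trace.append(it_power(hosts))
    return trace


for limit in (0, 2, 5):
    print(limit, run(limit))
```

In this toy example, a limit of 0 keeps the power at 500 W throughout, a limit of 2 reaches 200 W after three periods, and a limit of 5 reaches the same 200 W after a single period. This mirrors the qualitative observation above: different positive limits converge to the same consolidated state, only at different speeds.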

#### **7.2. Workload allocation—thermal preferences**

The experiments described in the following were performed under identical settings to those previously discussed with the exception that each server had an associated thermal preference, thereby allowing a proper ranking of servers. The thermal preference was used to rank the servers for consolidation.

**Figure 13.** Workload distribution per third of rack.

In addition to the baseline described in the previous section, experiments were executed to assess power consumption with and without thermal preferences for migration limits of 10 and 100. The experiments showed that there is little difference in the total IT power consumption for the thermally ranked server consolidation, while HVAC energy consumption was reduced by approximately 20 kWh over the 48‐h period relative to the baseline approach, and by 6.5 kWh compared to the scenario with 100 migrations and no thermal preference.

The behaviour of the scenarios with thermal preference can be better understood when analysed at the third of rack level (top, middle and bottom boxes) as shown in **Figure 13**. As can be observed, the only servers that were used by the GENiC energy management platform were those at the bottom level of three racks: B1, B3 and B4. The loads from all the other servers were migrated to servers in these locations and then servers that lost IT load were powered off, as can be seen from the power value for the scenario with thermal preference and limit of 100 migrations (bottom graph in **Figure 13**).

Finally, **Figure 14** presents the temperature distribution of the case study data centre C130 for (a) the thermal preference with 100 migrations and (b) the baseline. The baseline study indicates a risk of a hot spot at the top layer of the last rack in row B. The supply air temperature is around 18°C; however, the inlet temperature of that particular box is approximately 23°C. This temperature rise is due to infiltration of hot air from the hot aisle into the cold aisle space. The optimized workload allocation with thermal preference scenario ensures that the airflow will use the shortest path from the cold air supply to the heat source. The cold air is taken in by the preferred servers in the bottom boxes. The typical cold aisle‐hot aisle distribution can be observed in this case. The inlet temperature of all active servers is approximately 18°C. This evaluation shows that the developed energy management platform can balance the temperature distribution in a data centre in such a manner as to avoid hot spots without the need for extensive structural changes to the cooling layout, for example, hot aisle containment.

**Figure 14.** Temperature distribution for (a) thermal preference and (b) baseline.

## **8. Conclusions**

In this chapter, an architecture for an integrated energy management system for data centres was presented. The architecture and prototype implementation were developed within the European Commission funded GENiC project. The proposed system combines optimisation of energy consumption by encapsulating monitoring and control of IT workload, data centre cooling, local power generation and waste heat recovery.

The project conducted an initial evaluation of the platform in terms of IT workload, thermal and power management based on a simulation model of a real data centre. The initial simulation‐based assessment was chosen by the project for a number of reasons. First, it allows the performance of management and control algorithms to be evaluated before deployment in the real data centre space. Second, the architecture of the platform is designed such that the system interacts with the simulated data centre in the same manner as it interacts with the components in a real data centre, which also allows novel management and control concepts to be tested and commissioned before deployment in the target space.

The specific algorithms developed in the GENiC project attempt to optimise strategies focused on workload, thermal and power management in a data centre. The optimisation occurs at different time horizons: short‐term predictions are generated to support actuation decisions that are made within each of the mentioned management groups, and long‐term predictions support decision‐making at the supervisory level (coordinating management groups). The evaluation presented in this chapter focused on an initial analysis of workload and thermal management techniques. The operation strategies applied by the Workload Allocation Optimisation GC show significant savings potential (of up to 40%) in terms of total energy consumption. This reduction is achieved by optimising the allocation of virtual machines (VMs) while switching off unused servers. The performance of the Workload Allocation Optimisation GC shows a more effective utilization of the data centre with the same number of processed IT jobs. The GENiC project will replace the simulation environment by a real physical data centre for the final evaluation and demonstration of the developed management algorithms and strategies in a real‐world setting.

## **Acknowledgements**

The authors acknowledge the European Commission's 7th Framework Programme for part‐funding the work reported here under Grant No. 608826.

# **Author details**

Dirk Pesch<sup>1</sup>\*, Susan Rea<sup>1</sup>, J. Ignacio Torrens<sup>2</sup>, Vojtech Zavrel<sup>2</sup>, J.L.M. Hensen<sup>2</sup>, Diarmuid Grimes<sup>3</sup>, Barry O'Sullivan<sup>3</sup>, Thomas Scherer<sup>4</sup>, Robert Birke<sup>4</sup>, Lydia Chen<sup>4</sup>, Ton Engbersen<sup>4</sup>, Lara Lopez<sup>5</sup>, Enric Pages<sup>5</sup>, Deepak Mehta<sup>6</sup>, Jacinta Townley<sup>6</sup> and Vassilios Tsachouridis<sup>6</sup>

\*Address all correspondence to: dirk.pesch@cit.ie


## **References**

